ABSTRACT

Fourth, seventh, and ninth grade students in Elementary and Secondary
Education Act (ESEA) Title I programs were tested with the reading
comprehension subtests of the Comprehensive Tests of Basic Skills at
each of two levels: on-level for each respective grade, and an easier
out-of-level form. Approximately half of these students were found to
be scoring at or below the chance level for the on-level tests. It was
judged that in such circumstances it is preferable to use out-of-level
tests, particularly at the seventh and ninth grade levels. (CHh)

Implications of Out-of-Level Testing
for ESEA Title I Students

Introduction

Researchers have advised administrators and evaluators of ESEA
Title I programs to test students at their achievement level and not on
the basis of the students' grade level (Horst and Fagan 1976; Roberts 1976).
Other researchers assert that out-of-level testing is a questionable
procedure until test publishers develop out-of-level norms which would
aid in obtaining meaningful derived scores (Long, Schaffran and
Kellogg 1977). Most of the major achievement test publishers provide
a common metric by which scores obtained through out-of-level testing
may be converted to their on-level equivalents. With such a common
metric, test publishers indicate that out-of-level testing may be
conducted and the results converted to the appropriate grade level of
the students.

During the 1977-78 school year the Elementary and Secondary
Education Act (ESEA) Title I compensatory reading program was provided
to approximately 4000 students in grades K-12 of the Tucson Unified
School District. In the fall of 1977 a study was conducted with
Title I students enrolled in the fourth, seventh and ninth grades.
The purposes of the study were: 1) to compare the achievement of students
tested with the on-level and out-of-level reading comprehension subtests
of the Comprehensive Tests of Basic Skills, Form S (CTBS/S); 2) to
determine if the on-level or out-of-level form of the CTBS/S Reading
Comprehension subtests was more suitable for Title I students; 3) to
ascertain if there were significant differences between the out-of-level
and on-level test scores when the scores were converted to a common
metric, the CTBS/S Expanded Standard Score Scale; 4) to investigate
if there were trends in the data which indicated a linear or curvilinear
relationship.

By means of the CTBS/S Expanded Standard Score Scale, out-of-level
testing scores could be converted to the appropriate grade level scores.
Fundamentally, the Expanded Standard Score Scale (an equal interval,
normalized scale) was developed following Thurstone's Absolute Scaling
Method as described by Gulliksen (1950).

Classic test theory has been formulated either in terms of true
score and error or in terms of a definition of parallel test forms.
Recently, the concept of domain sampling has appeared in measurement
literature. According to Thorndike (1971), page 9, "A somewhat different
conception has been offered in recent years, the conception of a domain
of admissible tasks from which the test was drawing one sample.
Reliability is then conceived as the accuracy with which the sample
represents the complete domain from which it was drawn."

When the samples are small, the precision of measurement is
poor. Accordingly, the proportion of test material on which students
should spend their time should not drop too low. Moreover, increasing
the sample size should not only increase the precision of measurement, it
should also reduce random sampling errors.

Methodology

In the fall of 1977 the CTBS/S Reading Comprehension subtests
were administered to a selected sample of 89 students enrolled in the
ESEA Title I project in the fourth, seventh and ninth grades. At the
first testing session, one-half of each group was tested with the
on-level test and the other half was tested with the out-of-level test.
Within one week of the first testing, a second testing session was held
in which the groups were reversed so as to avoid any bias resulting
from the sequence of testing. The number of students and the levels of
the CTBS/S involved in the testing are presented in Table 1.



Insert Table 1 about here

The CTBS/S, Levels 1-4, is a battery of seven tests measuring
three basic skills areas: Reading, Language and Mathematics (CTBS/S
Test Coordinator's Handbook 1976). The skills areas were classified
using Bloom's Taxonomy of Educational Objectives (Appendix A). In the
test development, efforts were made to reduce racial and ethnic bias.
The K-R 20 reliabilities at each grade level for vocabulary, comprehension
and total scores are almost all above .90, with standard errors of
measurement from .25 to 1.01 in grade equivalent units. Moreover, it
appeared that systematic procedures were followed in test development
to ensure content validity. The CTBS/S Reading Comprehension subtests,
Levels 1-4, are composed of 45 items, and each item is multiple choice
with four alternatives. The Reading Comprehension: Passages subtest in
Level C included 18 items, each multiple choice with four alternatives.
For an outline of the grade levels recommended for administration of
the CTBS/S, refer to Table 2.



Insert Table 2 about here 



Before the statistical analysis, raw scores were converted
to the CTBS/S Expanded Standard Score Scale. Calculations for the
present study were performed with expanded standard scores unless
otherwise noted. As the expanded standard scores are a normalized
scale with assumed equal intervals, it was believed this metric was
more appropriate for statistical analysis. This choice of a metric
is in agreement with the technical advice to ESEA Title I evaluators
(Tallmadge and Wood 1976). When it was desired to convert an out-of-level
test statistic up to the appropriate on-level statistic, the raw scores
were converted to expanded standard scores and the desired statistic
was computed with expanded standard scores. Then the appropriate
grade level table was consulted, and the expanded standard score was
used to ascertain the grade level raw score, percentile, stanine or
grade equivalent.

In summary, the purposes of the present study were the following:
1) to compare the achievement of students tested with the on-level and
out-of-level reading comprehension subtests of the CTBS/S; 2) to
determine if the on-level or out-of-level subtests were suitable by
investigating the chance level, floor and ceiling effects, test
suitability and a reliability index; 3) to investigate any significant
differences between mean expanded standard scores; 4) to study any
linear or curvilinear trends in the data.

Results

1. Descriptive statistics were studied to compare the achievement
of Title I students who were administered both an on-level and
out-of-level CTBS/S Reading Comprehension subtest. Summary statistics
are presented in Table 3. Descriptive statistics in expanded standard
scores and raw scores are displayed in Appendix B.



Insert Table 3 about here 




At the ninth and seventh grade levels, the out-of-level testing
indicated higher percentiles and grade equivalents than the on-level
testing. At the fourth grade level, the percentiles and grade
equivalents appeared lower for the out-of-level testing than for
the on-level testing. Of course, one would not expect to find exactly
the same mean (X), percentile or grade equivalent even if the students
were tested with exactly the same test under optimal testing conditions.
Variation in testing results would be expected under the best conditions.

2. To determine if the on-level or out-of-level Reading
Comprehension subtest was more suitable for Title I students, the
following four factors were investigated: a) chance level, b) floor
and ceiling effects, c) test's suitability and d) test's reliability.

a. The chance level of a test is a phenomenon which
should be investigated when testing Title I students. Since Title I
students are selected because of their need for Title I reading programs,
the proportion of Title I students scoring at chance level will often
reach unacceptable levels. When the number of students scoring at
chance level is unreasonably large, this could be an indication that
out-of-level testing is a better procedure than on-level testing.

Gulliksen (1950, p. 263) provides guidelines for
investigating the chance level of tests. The average chance score
($\bar{X}_c$), the total number of items in a test ($K$), and the standard
deviation of the distribution of chance scores ($SD_c$) are three quantities
which assist the evaluator in investigating the meaningful score range
for a test. The following formulas are used to compute the average
chance score and the standard deviation of chance scores:

$$\bar{X}_c = \frac{K}{A}, \qquad SD_c = \frac{\sqrt{K(A-1)}}{A}$$

where $K$ = the total number of items on a test and $A$ = the number of
alternative answers for each item. When applied to the data in the
present study, a score obtained on Levels 1-4 that is less than 15 or
on Level C that is less than 7 would fall within the upper limits of
one standard deviation of the distribution of chance scores (Table 4).
"Any score within one or two standard deviations ($SD_c$) of a chance
score should not be interpreted as signifying any knowledge of the
subject matter of the examination" (Gulliksen 1950, p. 263).
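The chance-level arithmetic above can be checked with a short sketch. This is only an illustration of the two formulas as reconstructed here (average chance score K/A, standard deviation sqrt(K(A-1))/A); the function names are hypothetical and are not part of the original study.

```python
import math


def chance_score_stats(num_items, num_alternatives):
    """Return (average chance score, SD of chance scores) for a
    multiple-choice test with K items and A alternatives per item."""
    mean_chance = num_items / num_alternatives
    sd_chance = math.sqrt(num_items * (num_alternatives - 1)) / num_alternatives
    return mean_chance, sd_chance


def chance_cutoff(num_items, num_alternatives, num_sd=1):
    """Upper limit of the chance range: mean chance score plus num_sd SDs."""
    mean_chance, sd_chance = chance_score_stats(num_items, num_alternatives)
    return mean_chance + num_sd * sd_chance


# Item counts taken from the text: 45 items for Levels 1-4, 18 for Level C,
# four alternatives per item in both cases.
for label, items in (("Levels 1-4", 45), ("Level C", 18)):
    mean_c, sd_c = chance_score_stats(items, 4)
    print(f"{label}: chance mean = {mean_c:.2f}, SD = {sd_c:.2f}, "
          f"upper limit (+1 SD) = {chance_cutoff(items, 4):.2f}")
```

Under these assumptions the upper limits fall just below 15 raw score points for Levels 1-4 and just below 7 for Level C, which agrees with the cutoffs quoted in the text.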



Insert Tables 4 and 5 about here

When Title I students were tested with the on-level
tests, exceedingly large proportions of students at each grade level
scored within the limits of one standard deviation of the distribution
of chance scores. The proportion of Title I students who scored at the
chance level dropped noticeably when they were tested with the out-of-level
test. Lowering the number of students scoring at chance level should
provide a larger sampling of students' reading skills, and this in turn
should result in a more accurate estimate of the students' reading
ability. Table 5 displays the percent of students scoring at chance
level.

b. Floor and ceiling effects were investigated using
Roberts' (1976) guidelines. According to Roberts, the following
guidelines may be used: if the mean (X) is higher than the median (Md)
by about one-third of a standard deviation (SD), a floor effect may
have been encountered. If the mean is lower than the median by about
one-third of a standard deviation, a ceiling effect may have been
encountered. At the fourth grade level a floor effect was discovered
in the data for CTBS/S Level 1 and a ceiling effect for CTBS/S
Level C. It appeared that at the fourth grade level, neither the on-level
nor the out-of-level CTBS/S reading comprehension subtest was free of
floor or ceiling effects. Refer to Table 6 for the results of the
investigation of ceiling and floor effects.
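As a rough illustration of the floor and ceiling guideline, the sketch below applies it to the raw score means, medians and standard deviations reported in Table 6 and Appendix B. The guideline says "about one-third of a standard deviation"; treating that as a strict cutoff (difference at least SD/3) is an assumption made here for the sake of the example, and the function name is hypothetical.

```python
def floor_or_ceiling(mean, median, sd):
    """Classify a score distribution using the mean-minus-median guideline.

    Assumption: "about one-third of a standard deviation" is read as
    |mean - median| >= sd / 3.
    """
    threshold = sd / 3
    difference = mean - median
    if difference >= threshold:
        return "floor"
    if difference <= -threshold:
        return "ceiling"
    return "none"


# (grade, test level): (raw score mean, median, SD) as reported in the study
raw_score_stats = {
    ("Ninth", "4"): (12.8, 11.5, 5.7),
    ("Ninth", "3"): (21.1, 23.5, 8.1),
    ("Seventh", "3"): (13.5, 12.3, 5.0),
    ("Seventh", "2"): (18.5, 18.9, 6.5),
    ("Fourth", "1"): (20.9, 17.2, 10.7),
    ("Fourth", "C"): (13.9, 15.9, 4.3),
}

for (grade, level), (mean, median, sd) in raw_score_stats.items():
    print(f"{grade} grade, Level {level}: {floor_or_ceiling(mean, median, sd)}")
```

With this reading of the guideline, only the two fourth grade tests are flagged, a floor effect for Level 1 and a ceiling effect for Level C.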

Insert Table 6 about here



c. To determine the suitability of a test, some criteria
are needed. Roberts (1976) suggests that in most instances the level
of a test is suitable when the mean raw score of the group is equal to
or above a third of the maximum score, and somewhat less than
three-quarters of the maximum. Refer to Table 7 for the results of
applying Roberts' (1976) guidelines to Title I students.
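The suitability guideline can be expressed as a simple interval check. The sketch below is illustrative only: it reads "somewhat less than three-quarters" as a strict inequality, which is an assumption, and the function name is hypothetical. The raw score means and maximum scores are those reported in Table 7.

```python
def is_suitable(mean_raw_score, max_score):
    """True when the mean raw score lies in [max/3, 3*max/4).

    Assumption: the upper bound ("somewhat less than three-quarters")
    is treated as a strict inequality.
    """
    return max_score / 3 <= mean_raw_score < 0.75 * max_score


# (grade, test level): (raw score mean, maximum score) from Table 7
tests = {
    ("Ninth", "4"): (12.8, 45),
    ("Ninth", "3"): (21.1, 45),
    ("Seventh", "3"): (13.5, 45),
    ("Seventh", "2"): (18.5, 45),
    ("Fourth", "1"): (20.9, 45),
    ("Fourth", "C"): (13.9, 18),
}

for (grade, level), (mean, max_score) in tests.items():
    verdict = "Yes" if is_suitable(mean, max_score) else "No"
    print(f"{grade} grade, Level {level}: suitable = {verdict}")
```

For a 45-item test the interval is 15.0 up to 33.75 raw score points, and for the 18-item Level C test it is 6.0 up to 13.5.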



Insert Table 7 about here



Three of the tests administered to Title I students were
found to be inappropriate when suitability guidelines were applied:
Grade 9, Level 4; Grade 7, Level 3; and Grade 4, Level C.

d. One index for estimating a test's reliability is
described by Roberts (1976). Tests are constructed by the publishers
so that the median score at the appropriate grade level is well above
half the number of items in the test. Thus, for the average class, a
ceiling effect is more likely than a floor effect. According to Roberts
(1976), "The highest reliability of a test is achieved when the students,
on the average, get slightly more than half the items correct."



Insert Table 8 about here



The median number of correct responses was increased in
all cases where out-of-level testing was conducted (Table 8). This
should increase the test's reliability.

3. To determine if there were significant differences between
testing results in expanded standard scores, the out-of-level means
were compared with the on-level means using correlated t-tests. At the
ninth and seventh grade levels the differences between out-of-level and
on-level test scores fell within the range of sampling error. Therefore
the null hypothesis that both samples of students' abilities were from
the same population could not be rejected. The fourth grade out-of-level
and on-level testing appeared significantly different, but further
investigation revealed both a floor effect with the on-level testing
and a ceiling effect with the out-of-level testing. Accordingly, little
confidence could be placed in the statistical significance found at the
fourth grade level because the score distributions at the fourth grade
level were either inflated or depressed. Table 9 presents t-tests
comparing out-of-level test means with on-level test means. For t-tests
with raw scores, refer to Appendix C.
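The correlated t-tests can be approximately reproduced from summary statistics alone, because the variance of paired differences equals the sum of the two variances minus twice the covariance. The sketch below is not the procedure used in the study; it is a consistency check that assumes the expanded standard score means and standard deviations in Appendix B and the correlations in Appendix D were computed on the same paired samples. The function name is hypothetical.

```python
import math


def paired_t_from_summary(mean_out, mean_on, sd_out, sd_on, r, n):
    """Correlated (paired) t statistic and degrees of freedom computed from
    summary statistics, using
    var(difference) = sd_out^2 + sd_on^2 - 2 * r * sd_out * sd_on."""
    var_diff = sd_out ** 2 + sd_on ** 2 - 2 * r * sd_out * sd_on
    se_diff = math.sqrt(var_diff / n)
    return (mean_out - mean_on) / se_diff, n - 1


# grade: (out-of-level mean, on-level mean, out-of-level SD, on-level SD, r, N)
# Expanded standard scores, as reported in Appendix B and Appendix D.
summary = {
    "Ninth": (461.0, 438.3, 86.9, 87.0, 0.321, 28),
    "Seventh": (402.1, 386.9, 61.7, 62.1, 0.435, 33),
    "Fourth": (290.5, 367.6, 27.2, 70.4, 0.669, 28),
}

for grade, values in summary.items():
    t_value, df = paired_t_from_summary(*values)
    print(f"{grade}: t({df}) = {t_value:.2f}")
```

Because the inputs are rounded, the results can differ from the reported t values in the second decimal place.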



Insert Table 9 about here 



Pearson product-moment correlations were computed between
the out-of-level scores and the on-level scores for each grade level
to determine the shared variation between the two testing levels. The
proportion of shared variance (r²) was surprisingly low at the ninth
and seventh grade levels, although at the fourth grade level the
proportion of shared variance was more than double that of the seventh
grade and almost five times that of the ninth grade level. Correlations
of out-of-level test scores with on-level test scores are displayed in
Table 10. For correlations using expanded standard scores and raw
scores, refer to Appendix D.
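The proportion of shared variance is simply the square of the correlation coefficient. As a minimal sketch, the expanded standard score correlations reported in Appendix D can be squared as follows; the function name is hypothetical.

```python
def shared_variance(r):
    """Proportion of variance shared by two sets of scores (r squared)."""
    return r * r


# Expanded standard score correlations as reported in Appendix D
correlations = {"Ninth": 0.321, "Seventh": 0.435, "Fourth": 0.669}

for grade, r in correlations.items():
    print(f"{grade}: r = {r:.3f}, shared variance = {shared_variance(r):.3f}")
```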

Insert Table 10 about here 



The increase in proportion of shared variance for fourth
grade students may be related to the decrease in the proportion of
students scoring at the chance level in the fourth grade, although
there are other possible explanations. If the number of students
scoring at the chance level is reduced, this could be one factor which
increases the accuracy of the measurement and contributes to a higher
correlation between the two levels of the test at the fourth grade
level.

4. The possibility of curvilinear trends in the data was
investigated following a procedure outlined by Kerlinger (1973). The
on-level test scores were regressed on the out-of-level test scores at
each of the three grade levels. This was accomplished separately for
the expanded standard scores and the raw scores. The out-of-level test
scores were squared and entered into the regression equation to determine
if the variance accounted for in the on-level scores was significantly
increased:

$$y = a + b_1 x + b_2 x^2$$

where $y$ = the predicted on-level scores, $a$ = the intercept, $b_1$ and
$b_2$ = the b-weights applied to the predictor scores, $x$ = the
out-of-level predictor scores and $x^2$ = the squared out-of-level
predictor scores. In the present analysis the trends in raw scores and
expanded standard scores were studied. No significant departures from
linearity were indicated in the data at any grade level. Therefore, the
relationships are best described at each grade level by a linear
equation.
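Whether the squared term adds significantly to the prediction is commonly judged with an F test on the increment in R². The sketch below shows that standard test under stated assumptions; the study does not report its R² values for the quadratic models, so the quadratic R² used here is a purely hypothetical number chosen for illustration, and the function name is likewise hypothetical.

```python
def f_for_quadratic_increment(r2_linear, r2_quadratic, n):
    """F ratio for the gain in R^2 when x^2 is added to a linear model.

    Numerator df = 1 (one added predictor); denominator df = n - 3
    (n cases minus two predictors minus the intercept).
    """
    df_num = 1
    df_den = n - 3
    f_ratio = ((r2_quadratic - r2_linear) / df_num) / ((1 - r2_quadratic) / df_den)
    return f_ratio, df_num, df_den


# r2_linear is the fourth grade shared variance reported in Appendix D;
# r2_quadratic = 0.460 is a HYPOTHETICAL value, not a result from the study.
f_ratio, df_num, df_den = f_for_quadratic_increment(0.447, 0.460, 28)
print(f"F({df_num}, {df_den}) = {f_ratio:.2f}")
```

The resulting F ratio would then be compared with the critical value of F for the stated degrees of freedom; a small ratio indicates no significant departure from linearity.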



Summary and Discussion

The purposes of the present study were: 1) to compare the
achievement of Title I students who were administered both an on-level
and out-of-level CTBS/S Reading Comprehension subtest, 2) to determine
if the on-level or out-of-level Reading Comprehension subtest was more
suitable for Title I students, 3) to ascertain if there were significant
differences between the out-of-level and on-level mean scores when the
scores were converted to the Expanded Standard Score Scale and 4) to
investigate if there were linear or curvilinear trends in the data.

1. The achievement of Title I students who were administered
out-of-level and on-level tests was compared after the raw scores were
converted to expanded standard scores and appropriate statistics
computed. The out-of-level means were converted to the appropriate
grade levels. Students at the seventh and ninth grade levels attained
higher means, percentiles and grade equivalents with out-of-level
testing when the expanded standard score means were converted to the
appropriate grade level statistic. Fourth grade students' mean expanded
standard score was higher for the on-level testing. It was pointed out,
however, that floor effects occurred in the on-level test and ceiling
effects occurred in the out-of-level testing at the fourth grade level.
Because these floor and ceiling effects either artificially depressed
or spuriously inflated the score distributions, it was not possible to
determine the comparability of out-of-level and on-level means at the
fourth grade level.

2. To determine if the out-of-level or the on-level tests
were more suitable for Title I students, the data were analyzed to
determine: a) if a large number of students were at chance level,
b) if there were floor or ceiling effects, c) if the levels of the
tests were appropriate and d) if the median score indicated a reliable
test.

Large percentages of students scored within the chance
level on the on-level test. When the same students were tested with
out-of-level tests, the proportion scoring at the chance level dropped
noticeably. By lowering the number of students at the chance level, a
larger sample of the students' abilities should be obtained and this
should result in a more precise measurement of the students' reading
ability.

Using Roberts' (1976) criteria to investigate floor and
ceiling effects, it was determined that floor effects occurred in the
on-level test and that ceiling effects occurred with the out-of-level
test at the fourth grade level. Neither floor nor ceiling effects were
indicated in the data at the seventh and ninth grades.

Roberts (1976) provides guidelines for determining when a
test is suitable in terms of difficulty. When these guidelines were
applied to the tests administered on-level and out-of-level, the tests
at the fourth grade level were found to be unsuitable. Moreover, the
on-level tests at the seventh and ninth grades appeared to be unsuitable.

The median number of correct responses was compared with
the median number of possible responses to obtain a general indication
of the test's reliability. The median score at the appropriate grade
level should be above half the number of items in the test (Roberts 1976).
When students were administered the out-of-level test, the median number
of correct responses was increased over the median correct responses of
the on-level test. According to Roberts (1976) this should indicate an
increase in reliability.








3. To determine if there were significant differences between
the mean expanded standard scores of the out-of-level tests and the
mean expanded standard scores of the on-level tests, correlated t-tests
were computed. At the seventh and ninth grades no significant differences
were found between the out-of-level and on-level mean scores. The
differences between the out-of-level means and the on-level means could
be the result of sampling error. Thus the information obtained from
out-of-level testing did not appear to be significantly different from
the information obtained from on-level testing. At the fourth grade
level a ceiling effect artificially depressed the out-of-level test
scores and a floor effect spuriously inflated the on-level test scores.
This is probably the major factor in the statistical significance found
between on-level and out-of-level means at the fourth grade level.








Pearson product-moment correlation coefficients were computed
between out-of-level scores and on-level scores at each grade level to
determine the shared variation between the two test levels. The
proportion of shared variance was surprisingly low at the ninth and
seventh grade levels. The proportion of shared variance was
much greater at the fourth grade level. A major factor in the increase
in shared variance may have been the reduction of the chance level at the
fourth grade level and the resulting increase in precision of measurement.

4. The possibility of curvilinear trends was investigated
following a procedure described by Kerlinger (1973), but no such trends
were found. The data at each grade level could be described best by a
linear equation.

Conclusions 

What are the implications of out-of-level testing for ESEA
Title I students? Two major questions are often raised by ESEA Title I
administrators and evaluators:

1) Can out-of-level test results be converted by means
of a common metric to on-level equivalents?

2) Will out-of-level testing increase the precision of
measurement of students who are below the grade
level for which the test is designed?

The results of this study indicate that when ESEA Title I
students are tested with out-of-level and on-level CTBS/S Reading
Comprehension subtests, the means obtained when the scores are converted
to the CTBS/S Expanded Standard Score Scale are reasonably equivalent
within sampling error.

The equating of test means across grade levels is not as
important an issue as the increase in measurement precision obtained
through testing students at their instructional level. Indeed, the
chance level, floor and ceiling effects, the test's suitability (i.e.,
Roberts' criteria) and the medians are indicators of the precision of
measurement. In this study the out-of-level tests appeared to be
preferable to on-level tests at the ninth and seventh grades because
the chance level dropped to lower levels with out-of-level testing,
the on-level tests were shown to be unsuitable using Roberts' (1976)
guidelines and the median number of correct responses was raised.
At the fourth grade level, where ceiling effects occurred with the
out-of-level testing and floor effects occurred with the on-level
testing, the results could not be equated across grade levels because
the scores were either artificially depressed or inflated. Therefore,
at the fourth grade level it appears the tests were either unusually
difficult (on-level) or exceedingly easy (out-of-level).

In conclusion, if a small bias is introduced into the data
by testing out-of-level, it is perhaps better to accept the small bias
if it thereby becomes possible to increase substantially the precision
of measurement.



Table 1. Numbers of Students and Levels of the CTBS/S

Grade      N*    Test Level    Items

Ninth      28    4             45
                 3             45
Seventh    33    3             45
                 2             45
Fourth     28    1             45
                 C**           18

*The same students were tested with the out-of-level and the on-level test.
**At Level C, students were administered the subtest Reading Comprehension:
Passages.

Table 2. CTBS/S Test Levels and Recommended Grades

Test Level    Grades

A             K.0 - 1.3
B             K.6 - 1.9
C             1.6 - 2.9
1             2.5 - 4.9
2             4.5 - 6.9
3             6.5 - 8.9
4             8.5 - 12.9













Table 3. Summary Statistics of the CTBS, Levels C-4

Grade      Test Level    N     X      %ile    Sta.    GE

Ninth      4             28    438    16      3       4.6
           3             28    461    23      4       5.3
Seventh    3             33    387    13      3       3.4
           2             33    402    16      3       3.7
Fourth     1             28    368    27      4       3.1
           C             28    290    16      3       1.9



Table 4. Average Chance Scores and Standard Deviations For
Investigating Chance Level of Tests

Grade      Test Level    Total Items    X_c      SD_c

Ninth      4             45             11.25    ±2.90
           3             45             11.25    ±2.90
Seventh    3             45             11.25    ±2.90
           2             45             11.25    ±2.90
Fourth     1             45             11.25    ±2.90
           C             18              4.50    ±1.84













Table 5. Percent of Students Scoring at Chance Level on the CTBS

Grade      Test Level    N     N at Chance Level    Percent

Ninth      4             28    16                   57%
           3             28     5                   18%
Seventh    3             33    19                   58%
           2             33    10                   30%
Fourth     1             28    10                   36%
           C             28     1                    4%



Table 6. Ceiling and Floor Effects, Out-of-Level and On-Level
Reading Comprehension Subtests of the CTBS (a, b)

Grade      Test Level    X       Md      X-Md    1/3 SD

Ninth      4             12.8    11.5     1.3    1.9
           3             21.1    23.5    -2.4    2.7
Seventh    3             13.5    12.3     1.2    1.7
           2             18.5    18.9    -0.4    2.2
Fourth     1             20.9    17.2     3.7    3.6*
           C             13.9    15.9    -2.0    1.4**

(a) If the mean is lower than the median by about one-third of a standard
deviation, a ceiling effect may have been encountered. If the mean
is higher than the median by about one-third of a standard deviation, a
floor effect may have been encountered (Roberts 1976).
(b) The statistics have been computed from raw scores.
*Floor effects.
**Ceiling effects.



Table 7. The Suitability of the Out-of-Level and On-Level Reading
Comprehension Subtests of the CTBS (a)

Grade      Test Level    X (b)    Maximum Score    Interval     Suitable

Ninth      4             12.8     45               15.0-33.7    No
           3             21.1     45               15.0-33.7    Yes
Seventh    3             13.5     45               15.0-33.7    No
           2             18.5     45               15.0-33.7    Yes
Fourth     1             20.9     45               15.0-33.7    Yes
           C             13.9     18                6.0-13.5    No

(a) The level of a test is suitable when the raw score mean is equal to
or above one-third of the maximum score and somewhat less than
three-quarters of the maximum score (Roberts 1976).

(b) The statistics have been computed from raw scores.

Table 8. Medians and Reliability Criteria for Title I Students

Grade      Test Level    Total Items    Criterion    Md

Ninth      4             45             22.5         11.5
           3             45             22.5         23.5
Seventh    3             45             22.5         12.3
           2             45             22.5         18.9
Fourth     1             45             22.5         17.2
           C             18              9.0         13.9


Table 9. T-Tests Comparing Out-of-Level Tests with On-Level Tests
Using Expanded Standard Scores

Grade      df    X_out    X_on     Difference    t        p

Ninth      27    461.0    438.3     22.7          1.18    .247
Seventh    32    402.1    387.0     15.1          1.32    .195
Fourth     27    290.5    367.6    -77.1         -7.29    .000



Table 10. Correlations of Out-of-Level Test Scores with On-Level Test
Scores Using Expanded Standard Scores

Grade      N     r      r²     p

Ninth      28    .32    .10    .095
Seventh    33    .44    .19    .011
Fourth     28    .67    .48    .011

Appendix A

Number of Items in Each Item Classification for the Reading Comprehension
Subtest, Levels 1-4 (CTBS/S Test Coordinator's Handbook 1976)

Process-Content Category      Level 1        Level 2    Level 3    Level 4

1. Recognition/Application
   Literal Recall             12             8          4          3
2. Translation
   Rewording                  8              6          6          6
   Context Clues              2              5          7          4
3. Interpretation
   Main Idea                  8              5          6          [illegible]
   Descriptive Words          [illegible]    4          4          3
   Conclusions                11             8          12         11
4. Analysis
   Structure/Style            [illegible]    9          6          9

Total                         45             45         45         45



Appendix B

Descriptive Statistics Using Raw Scores and Expanded Standard Scores

A. Raw Scores

Grade      Test Level    N     X       Md      SD

Ninth      4             28    12.8    11.5     5.7
           3             28    21.1    23.5     8.1
Seventh    3             33    13.5    12.3     5.0
           2             33    18.5    18.9     6.5
Fourth     1             28    20.9    17.2    10.7
           C             28    13.9    15.9     4.3

B. Expanded Standard Scores

Grade      Test Level    N     X        Md       SD

Ninth      4             28    438.3    424.0    87.0
           3             28    461.0    489.5    86.9
Seventh    3             33    386.9    377.7    62.1
           2             33    402.1    416.2    61.7
Fourth     1             28    367.6    355.8    70.4
           C             28    290.5    292.7    27.2

Appendix C

Correlated T-Tests Comparing Out-of-Level Tests
With On-Level Tests of the CTBS Using Raw Scores and Expanded Standard Scores

A. Raw Scores

Grade      df    X_out    X_on    Difference     t              p

Ninth      27    21.1     12.8    [illegible]    [illegible]    .000
Seventh    32    18.5     13.5    [illegible]     4.59          .000
Fourth     27    13.9     20.9    [illegible]    -4.40          .000

B. Expanded Standard Scores

Grade      df    X_out    X_on     Difference    t        p

Ninth      27    461.0    438.3     22.7          1.18    .247
Seventh    32    402.1    386.9     15.2          1.32    .195
Fourth     27    290.5    367.6    -77.1         -7.29    .000

Appendix D

Correlations of Out-of-Level Tests with On-Level Tests
Using Raw Scores and Expanded Standard Scores

A. Raw Scores

Grade      N     r       r²      p*

Ninth      28    .314    .098    .103
Seventh    33    .443    .196    .010
Fourth     28    .676    .457    .000

B. Expanded Standard Scores

Grade      N     r       r²      p*

Ninth      28    .321    .103    .095
Seventh    33    .435    .189    .011
Fourth     28    .669    .447    .000

*two-tailed probability